Masahiro FUKUI Masakazu TANAKA Masaharu IMAI
This paper proposes a new flexible hardware model for pipelined design optimization. Used together with an RTL floorplanner, the flexible hardware model makes accurate, fine-grained design space exploration possible. It is quite effective for deep submicron technology, since estimation at a high level has become difficult and design tuning at lower levels of abstraction makes up the full design optimization task. The experimental results show that our approach reduces the slack time in the pipeline stages and thereby achieves higher performance with a smaller area.
Jun SATO Alauddin Y. ALOMARY Yoshimichi HONMA Takeharu NAKATA Akichika SHIOMI Nobuyuki HIKICHI Masaharu IMAI
This paper describes the current implementation and experimental results of a hardware/software codesign system for ASIP (Application Specific Integrated Processor) development: the PEAS-I system. The PEAS-I system accepts a set of application programs written in C, an associated data set, a module database, and design constraints such as chip area and power consumption. The system then generates an optimized CPU core design in the form of an HDL description, as well as a set of application program development tools such as a C compiler, an assembler, and a simulator. Another important feature of the PEAS-I system is that it can give accurate estimates of chip area and performance before the detailed design of the ASIP is completed. According to the experimental results, the PEAS-I system is highly effective and efficient for ASIP development.
Jun-ichi ITO Takumi NAKANO Yoshinori TAKEUCHI Masaharu IMAI
This paper proposes a method to reduce context switching time by using a register bank to store the contexts of working tasks. Hardware cost and performance were measured by modeling the register bank and its controller in VHDL. The following results were obtained: (1) The controller can be implemented at a much smaller hardware cost than the register bank, which is realized as an SRAM module. (2) Context switching time can be reduced to less than 50% of that of a software implementation. (3) Combining the proposed architecture with our previous work (an RTOS implemented in hardware) gives much higher performance for a hard real-time system.
Alauddin Y. ALOMARY Masaharu IMAI Jun SATO Nobuyuki HIKICHI
The performance of ASIPs (Application Specific Integrated Processors) is heavily affected by the design of their instruction set architecture. To maximize the performance of an ASIP, it is essential to design an architecture that has an optimum instruction set. This paper describes a new method that automates the design of an optimum instruction set for an ASIP. The method solves the Instruction set implementation Method Selection Problem (IMSP), which arises in instruction set architecture design. First, the IMSP is formalized as an integer programming problem that maximizes the performance of the CPU under constraints on chip area and power consumption. Then, a branch-and-bound algorithm to solve the IMSP is described. According to the experimental results, the proposed algorithm is quite effective and efficient in solving the IMSP. The presented method automates a complex part of ASIP chip design and is also a good design tool that enables designers to predict the performance of their design before completion.
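As an illustration of this kind of formulation, the following minimal sketch chooses one implementation method per operation with a simple branch-and-bound search. It is not the algorithm from the paper: the cost tuples (cycles, area, power), the name `solve_imsp`, and the choice to model "maximize performance" as "minimize total cycles" are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical cost model: each candidate method is (cycles, area, power).
Method = Tuple[int, int, int]


def solve_imsp(ops: List[List[Method]], max_area: int, max_power: int
               ) -> Tuple[Optional[int], Optional[List[int]]]:
    """Pick one method per operation, minimizing total cycles
    subject to area and power limits. Returns (cycles, choices),
    or (None, None) if no feasible selection exists."""
    n = len(ops)
    # Optimistic bound: fewest cycles achievable for operations i..n-1
    # if the area and power constraints are ignored.
    best_rest = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        best_rest[i] = best_rest[i + 1] + min(m[0] for m in ops[i])

    best_cycles: Optional[int] = None
    best_choice: Optional[List[int]] = None

    def search(i: int, cycles: int, area: int, power: int,
               choice: List[int]) -> None:
        nonlocal best_cycles, best_choice
        # Prune infeasible partial selections.
        if area > max_area or power > max_power:
            return
        # Prune branches that cannot beat the incumbent solution.
        if best_cycles is not None and cycles + best_rest[i] >= best_cycles:
            return
        if i == n:
            best_cycles, best_choice = cycles, list(choice)
            return
        # Try the fastest methods first so a good incumbent is found early.
        for k, (c, a, p) in sorted(enumerate(ops[i]), key=lambda km: km[1][0]):
            choice.append(k)
            search(i + 1, cycles + c, area + a, power + p, choice)
            choice.pop()

    search(0, 0, 0, 0, [])
    return best_cycles, best_choice


# Made-up data: method 0 is a hardware implementation, method 1 is software.
example_ops = [
    [(1, 50, 5), (10, 0, 0)],
    [(2, 40, 4), (20, 0, 0)],
    [(4, 30, 3), (8, 0, 0)],
]
result = solve_imsp(example_ops, max_area=80, max_power=10)
```

With the made-up data above, the area limit rules out implementing the first two operations in hardware together, so the search has to trade one against the other.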
Jun SATO Tsutomu KIMURA Masaharu IMAI Frank de SCHEPPER Kazuo YAMAZAKI Masashi NAGASE Shin-ichiro YAMAMOTO
This letter describes the architecture and ASIC implementation of the FSP-3 (Flexible Servo motor control Processor-3) chip. The FSP-3 is a special-purpose 32-bit microprocessor dedicated to the Flexible Servo Control System (FSC), which is able to control various kinds of servo motors efficiently. The FSP-3 chip is one of the largest-scale system ASICs designed entirely in Japanese universities.
Alauddin Y. ALOMARY Masaharu IMAI Nobuyuki HIKICHI
One of the most interesting and most analyzed aspects of CPU design is instruction set design. How many and which operations should be provided by hardware is one of the most fundamental issues relating to instruction set design. This paper describes a novel method that formulates the instruction set design of an ASIP (Application Specific Integrated Processor) using a combinatorial approach. Starting with the whole set of candidate instructions that represent a given application domain, this approach selects a subset that maximizes performance under constraints on chip area, power consumption, and the functional module sharing relation among operations. This leads to an efficient implementation of the selected instructions. A branch-and-bound algorithm is used to solve this combinatorial optimization problem. The approach selects the most important instructions for a given application and also optimizes the hardware resources that implement the selected instructions. It also enables designers to predict the performance of their designs before implementing them, which is quite important for producing a quality design in reasonable time.
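The effect of functional module sharing on the area constraint can be shown with a small sketch. Everything here is hypothetical: the module names, areas, and performance gains are invented, the power constraint is omitted, and a plain exhaustive search stands in for the branch-and-bound algorithm mentioned in the abstract.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def shared_area(selected: Iterable[str],
                needs: Dict[str, Set[str]],
                module_area: Dict[str, int]) -> int:
    """Area of a set of instructions when each functional module
    is counted once, no matter how many instructions use it."""
    modules: Set[str] = set()
    for ins in selected:
        modules |= needs[ins]
    return sum(module_area[m] for m in modules)


def best_subset(gain: Dict[str, int],
                needs: Dict[str, Set[str]],
                module_area: Dict[str, int],
                max_area: int) -> Tuple[int, Tuple[str, ...]]:
    """Exhaustively pick the instruction subset with the largest
    total gain whose shared area fits within max_area."""
    names = sorted(gain)
    best: Tuple[int, Tuple[str, ...]] = (0, ())
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            if shared_area(subset, needs, module_area) <= max_area:
                total = sum(gain[i] for i in subset)
                if total > best[0]:
                    best = (total, subset)
    return best


# Invented example data.
module_area = {"alu": 10, "mul": 40, "shift": 15}
needs = {"mac": {"mul", "alu"}, "mul": {"mul"}, "shl": {"shift"}}
gain = {"mac": 30, "mul": 20, "shl": 12}

chosen = best_subset(gain, needs, module_area, max_area=50)
```

In this example the `mac` and `mul` instructions both use the multiplier, so selecting both costs no more area than selecting `mac` alone. A cost model that summed per-instruction areas independently would reject that pair.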
Takuji HIEDA Hiroaki TANAKA Keishi SAKANUSHI Yoshinori TAKEUCHI Masaharu IMAI
Partial forwarding is a design method that places forwarding paths on only part of a processor pipeline. With partial forwarding, the hardware cost of a processor can be reduced without performance loss. However, a compiler whose instruction scheduler takes the partial forwarding structure of the target processor into account is required, since conventional scheduling algorithms cannot make the most of a partial forwarding structure. In this paper, we propose a heuristic instruction scheduling method for processors with a partial forwarding structure. The proposed algorithm uses the available distance to produce instruction schedules suited to the target partial forwarding processor. Experimental results show that the proposed method generates near-optimal solutions in practical time and that some of the code optimized for partial forwarding processors runs in the shortest time among the target processors. They also show that the proposed method is superior to a hazard detection unit.
Nguyen Ngoc BINH Masaharu IMAI Yoshinori TAKEUCHI
In designing ASIPs (Application Specific Integrated Processors), previous studies have focused mostly on the optimization of the CPU core and have not paid enough attention to optimizing the RAM and ROM sizes together with it. This paper overcomes this limitation and proposes an optimization algorithm that determines the best ratio between the CPU core, RAM, and ROM of an ASIP chip to achieve the highest performance while satisfying design constraints on the chip area. The partitioning problem is formalized as a combinatorial optimization problem that partitions the operations into hardware and software so that the performance of the designed ASIP is maximized under a given chip area constraint, where the chip area includes the hardware cost of the register file, for a given application program with an associated input data set. The optimization problem is parameterized so that it can be applied with different technologies for synthesizing CPU cores, RAMs, or ROMs. The experimental results show that the proposed algorithm is effective and efficient.
Masaharu IMAI Hitoshi KITAZAWA
Nguyen Ngoc BINH Masaharu IMAI Akichika SHIOMI Nobuyuki HIKICHI
This paper proposes a new method to design an optimal pipelined instruction set processor for ASIP development using a formal HW/SW codesign methodology. First, a HW/SW partitioning algorithm for selecting an optimal pipelined architecture is outlined. Then, an adaptive database approach is presented that enhances the optimality of the design through very accurate estimation of the performance of a pipelined ASIP in the HW/SW partitioning process. The experimental results show that the proposed method is effective and efficient.
Makiko ITOH Yoshinori TAKEUCHI Masaharu IMAI Akichika SHIOMI
A synthesizable HDL generation method for pipelined processors is proposed. Using the proposed method, data-path and control logic descriptions of a target processor are generated from a clock-based instruction set specification. The experimental results demonstrate the feasibility of the proposed method and show that processor design time is drastically reduced compared with conventional manual RT-level design in HDL.